ABA Registration Pipeline

Accompanies the paper entitled "Laminar and Dorsoventral Molecular Organization of the Medial Entorhinal Cortex Revealed by Large-scale Anatomical Analysis of Gene Expression"

Helen L. Ramsden, Gülşen Sürmeli, Steven G. McDonagh and Matthew F. Nolan

Notes:

This Ipython notebook contains the workflow for registering images using the ABA. A summary of this workflow is shown in the table below.

Each code block should be able to be run independently in a particular session, though of course some depend on others having been run at some point in the past (such as image download).

Some elements were run using Matlab, other scripts call ImageJ (where indicated) so you will need to have these installed on your computer, and you will also need to specify the path to them.

The only stage that is very difficult to run on a desktop computer is group registration.

The Example_Results folder contains the results of an example run with just 10 images. For this run we started with a list of genes of interest and our group-registered images for each section.

Please email helenlramsden+aba AT gmail DOT COM if you have any queries.

Disclaimer

This code has been edited from code written as a one-off to download and register images. We are making it available to be both transparent and helpful, but do not necessarily advocate taking the identical approach. In some places we have made suggestions as to possible improvements that we didn't implement. We cannot guarantee that it will work on any system. Additionally, we have had to make edits to the code as our original versions were set up to be run on a university cluster.

Licence

Copyright (c) 2014, Helen Ramsden All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


In [10]:
from IPython.display import Image
Image(filename='RelevantFigures/SummaryTable.png')


Out[10]:

Outline:

a. Download images from the ABA
b. Preprocessing
c. Cerebellar segmentation
d. Group registration (Matlab)
e. Med-lat classification (Matlab + Manual)
f. 1-to-1 Image registration (Matlab)
g. Apply registration transformation to raw and expression images (Matlab)
h. Image quality check

A) Image Download from the ABA

Parameters: (line in )


In [10]:
from ABAFunctions.ABA_DownloadImages import downloadjpgbylistnew
downloadjpgbylistnew('Example_Results/Images/', 'Example_Results/Images/', 'Example')


0 75457489
http://www.brain-map.org/aba/api/imageseries/75457489.xml
1 77620754
http://www.brain-map.org/aba/api/imageseries/77620754.xml
2 73513814
http://www.brain-map.org/aba/api/imageseries/73513814.xml
3 75496134
http://www.brain-map.org/aba/api/imageseries/75496134.xml
4 69838649
http://www.brain-map.org/aba/api/imageseries/69838649.xml
5 74735349
http://www.brain-map.org/aba/api/imageseries/74735349.xml
6 71307226
http://www.brain-map.org/aba/api/imageseries/71307226.xml
7 74579449
http://www.brain-map.org/aba/api/imageseries/74579449.xml
8 74357601
http://www.brain-map.org/aba/api/imageseries/74357601.xml
9 70523811
http://www.brain-map.org/aba/api/imageseries/70523811.xml

And resize (parameter: resizefactor = 1.25)


In [2]:
from ABAFunctions.ABA_ChooseImProcFunction import preprocess_pipeline
import ABAFunctions.ABA_settings as sett

In [18]:
infolderpath = sett.homepath + 'Original_ish/'
filelist = sett.homepath+ sett.prefix + 'filelist.txt'
#!ls $infolderpath | grep $sett.filesearch > $filelist
preprocess_pipeline('', infolderpath, '','', sett.filesearch, sett.macropath, 'resizeimages', sett.prefix,
                     filelist, '')

B) Image Preprocessing & Cerebellar Segmentation

Pre-process - ABA_imageprocessing

Parameters:
1) Resize to: 80%
2) Subtract background: 1 pixel
3) Threshold: Auto (Min error)
4) Edge detection smoothing: 10 pixels
5) create segmentation mask threshold: auto (Li)
Uses Python and ImageJ

Subfolders in place in case there's a very large number of images - not used for this example


In [3]:
from ABAFunctions.ABA_ChooseImProcFunction import preprocess_pipeline
import ABAFunctions.ABA_settings as sett

In [7]:
'''
Define the functions to be run
For each function, the output folder, input folder, type of file to look for, target file end
and type of ImageJ run should be specified 
Procedure: Subtract background --> segment the cerebellum from the ISH, processed and expression images
--> threshold the forebrain processed image
'''

runtypes= ['sbimages','SegmentCerebp1' + sett.groupreg + '_'+ sett.thresh,'SegmentCerebp2' + sett.groupreg + '_'+ sett.thresh,'threshimages',
           'SegmentCerebp3' + sett.groupreg + '_'+ sett.thresh]
outfilefolder = ['SB/','Segmented/','Segmented/','Thresh/','Segmented/']
infilefolder = ['Resize_ish/','SB/','Edges/','SB/','Segmented/SegmentedMask/']
filesearch = ['ishfull','ishfull_sb','ishfull_sb_edges','ishfull_sb','ishfull_sb_mask']
targetend = ['','_sb','_sb_seg','sb_seg','_sb_seg']
headless = [' --headless','','','','']

n =0
for count,runtype in enumerate(runtypes[n:]): #n+1

    count+=n
    runtype += '_codeonly'
    for s in sett.subfolders:
        layerlist = sett.homepath + 'imagej.txt'
        filelist = sett.homepath + 'filelist.txt'
        infolderpath = sett.fullhomepath + infilefolder[count] + s
        outfolderpath = sett.fullhomepath + outfilefolder[count] + s

        preprocess_pipeline('',infolderpath, outfolderpath, '', filesearch[count], sett.macropath,
                            runtype, sett.prefix, filelist,headless[count])
    
    fiji = '/Applications/Fiji.app/Contents/MacOS/fiji-macosx '
    macrofile = sett.macropath + runtype + '.ijm'
    ! $fiji $macrofile $headless[count]


/Users/helenlramsden/Dropbox/Projects/Ramsden_MECgeneexpression/Example_Results/Images/Segmented/SegmentedExp/ sb_mask 66
13
74579449_LOC433525_17_58ishfull_sb_mask 13 70523811_Rnf25_22_63ishfull_sb_mask 74579449_LOC433525_17_58ishfull_sb_mask
Files to be processed:  53
53
Oct 12 22:58:07 vpn2-248.vpn.net.ed.ac.uk fiji-macosx[37639] <Error>: The function `CGContextErase' is obsolete and will be removed in an upcoming update. Unfortunately, this application, or a library it uses, is using this obsolete function, and is thereby contributing to an overall degradation of system performance.

Check cerebellar segmentation

(1) Simple option - find the centre of mass of masks (checksegmented). Anything outside of the normal region may have problems with registration


In [8]:
from ABAFunctions.ABA_imageprocessing import getallfiledict
from ABAFunctions.ABA_errors import checksegmented
import ABAFunctions.ABA_settings as sett

filelist = sett.fullhomepath + sett.prefix + 'filelist.txt'
allfilesdict,_ = getallfiledict(sett.fullhomepath +'Segmented/SegmentedMask/', filelist, sett.filesearch)
checksegmented(sett.fullhomepath +'Segmented/SegmentedMask/',allfilesdict,sett.fullhomepath +'RegResults/')


/Users/helenlramsden/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scipy/ndimage/measurements.py:1277: RuntimeWarning: invalid value encountered in double_scalars
  for dir in range(input.ndim)]

(2) Write a classifier to distinguish between good and poor segmentation masks (see files used in Matlab/MLclassification/).

C) Group registration (Matlab)

First, mark landmarks on each image using ImageJ - save positions of these landmarks as 'Positions.txt' using ImageJ's Add to Manager. Note: the positions given in the example are just for demo purposes and may not be correct.
Run the alignment macro:


In [48]:
from ABAFunctions.ABA_ChooseImProcFunction import preprocess_pipeline
import ABAFunctions.ABA_settings as sett

In [11]:
for sc,s in enumerate(sett.slices):
    infolderpath = sett.fullhomepath +sett.filefolders[sc]+ 'GroupRegistration/Images_prealign/'
    outfolderpath = sett.fullhomepath + sett.filefolders[sc] +'GroupRegistration/Images_postalign/'
    filelist = sett.fullhomepath + sett.slices[sc] + sett.prefix + 'filelist.txt'
    preprocess_pipeline(s, infolderpath, outfolderpath,sett.chosentargets[sc], sett.filesearch, sett.macropath, 'alignmacrocodeonly', sett.prefix,
                         filelist, '')


2068_Brp44l_17_2463ishfull_sb_seg_sbthres.jpg
4760  3744
70918898_Vdac1_5ishfull_sb_seg_sbthres.jpg
3534  3096
68203457_Gnb1_30ishfull_sb_seg_sbthres.jpg
3510  2856
68844086_Nfasc_22ishfull_sb_seg_sbthres.jpg
3540  2778
70231011_Arhgef9_28ishfull_sb_seg_sbthres.jpg
3552  3036
71656802_Map3k12_14ishfull_sb_seg_sbthres.jpg
3714  3150
77454574_Pgrmc1_25ishfull_sb_seg_sbthres.jpg
3708  3078
71381076_Map2k1_26ishfull_sb_seg_sbthres.jpg
3720  3018
77454574_Pgrmc1_25ishfull_sb_seg_sbthres
70231011_Arhgef9_28ishfull_sb_seg_sbthres
71656802_Map3k12_14ishfull_sb_seg_sbthres
68844086_Nfasc_22ishfull_sb_seg_sbthres
70918898_Vdac1_5ishfull_sb_seg_sbthres
2068_Brp44l_17_2463ishfull_sb_seg_sbthres
68203457_Gnb1_30ishfull_sb_seg_sbthres

In [13]:
fiji = '/Applications/Fiji.app/Contents/MacOS/fiji-macosx '
macrofile = sett.macropath + 'alignmacrocodeonly.ijm'
! $fiji $macrofile


Sep 29 17:46:37 helen-ramsdens-macbook-pro.local fiji-macosx[8754] <Error>: The function `CGContextErase' is obsolete and will be removed in an upcoming update. Unfortunately, this application, or a library it uses, is using this obsolete function, and is thereby contributing to an overall degradation of system performance.

Remove borders around images using Matlab/Registration/Functions/imstack_removeBG.m

Now group register aligned images

Download MIRT from https://sites.google.com/site/myronenko/research/mirt
See Matlab/Registration/Functions/run_group_registration_ecdf.m

Parameters:
1) Gaussian blur * 3, window 13, sd 3
2) Adjust contrast

For the demo, we've included the median images we used.

D) Med-lat classification (Matlab + Manual)

There are two options:

1) ABA images - use the referenceatlasindex number in the image xml file (this is the number preceding ishfull in the filename). The demo shows how to classify using this value. Bear in mind that the referenceatlas index value may not be accurate.

2) For non-ABA images and to check ML classification, use SVM classification or equivalent. We have included parts of the code and the overall used to run this.

(1) Referenceatlasindex

This needs to be run in ipython console because os.system within script may not work


In [9]:
from ABAFunctions.ABA_MLclassification import sort_ml_images
import ABAFunctions.ABA_settings as sett

filelist = sett.homepath + sett.prefix + 'filelist.txt'
sort_ml_images(sett.homepath,filelist,sett.filesearch)

(2) SVM classification

See code in Matlab/MLclassification
1) Download the SVM image classification package from http://www.robots.ox.ac.uk/~vgg/share/practical-image-classification.htm (see VedaldiFunctions)
2) Look through a subset of images (training data) and label the image closest to the central reference image with a 1. Label all other images with a -1.
3) Randomly separate the training data into train and test sets.
4) Extract features from the images using provided code (see classifyimages in MLclassification)
4) Train the SVM model and optimise the hyperparameters of the SVM
5) Check performance on the test set
6) Apply the SVM model to all other images. The result should be saved in a Log file: Column 1: filename, Column 2: score
6) Parse the log file and for each iseries extract the position id of the image with the highest score. Save a file called ML3loc.txt containing 2 columns: (1) Iseries, (2) position id with max score.
7) Copy the files into different folders: Run the script useSVMclassification instead of userefatlasindex in ABA_MLclassification.py.
8) Check the NoML folder for any wrongly classified images. Delete these. Remake a list containing the correct position ids of the central image and remove files.

Run Matlab like this:


In [ ]:
matlabpath = 'matlab' # path to matlab executable
matlabscript = 'Matlab/**.m'
! $matlabpath $matlabscript

E) 1-1 image registration

Pre-process - ABA_imageprocessing

Parameters:
1) Gaussian blur * 3, window 13, sd 3
2) Adjust contrast


In [ ]:
matlabpath = 'matlab'
matlabscript = 'Matlab/Registration/run_1_preprocess.m'
! $matlabpath $matlabscript

Register - folder_reg, im_reg

Parameters:
1) Regularisation, lambda: 0.1 (default 0.01) 2) Similarity measure = mutual information


In [ ]:
matlabpath = 'matlab'
matlabscript = 'Matlab/Registration/run_2_reg.m'
! $matlabpath $matlabscript

Collect log files so that you can validate MI scores


In [ ]:
from ABA_parselogfiles import parselogfiles
parselogfiles()

F) Apply registration transformation

This applies the transformation to both the original, resized ISH image and the resized expression image.


In [ ]:
import os
matlabpath = 'matlab'
matlabscript = 'Matlab/Registration/run_3_transforms.m'
! $matlabpath $matlabscript

G) Image Quality Check

(1) MI log

Plot the distribution of MI scores


In [ ]:
from ABA_errors import plotmi
plotmi()

Pseudo code showing how potential errors in mutual information score were saved


In [ ]:
midata = dict((re.sub('_sbthres_seg_blur','',line.strip().split('\t')[0]),line.strip().split('\t')[2:5]) 
              for line in open('alllogdata.txt')) # filename, mi log data, just ML3 atm
'''
'''
for f in sorted(filenames): # all files registered
    '''
    '''
    errorlog = ''
    if f in midata.keys():
        mivals = midata[f]
        if float(mivals[1])> -400: errorlog+='-poorMI'
        elif float(mivals[1])> -300: errorlog+='-suspectMI'
    else: mivals = ['','','']

(2) Cross correlation

Run run_4_crosscorr.m - this generates a log file containing details of the offsets between the reference image and image in question. Anything with an offset greater than 10 pixels was flagged

(3) SVM classification - see Quality/trainclassifier_qual.m

See also classification for medio-lateral extent
Images were flagged in they had a probability of being an error > 0.13 - this score is highly dependent on the particular training set and may be different across medio-lateral groups


In [ ]: